Is there a relationship between times of recession and other key indicators?

May 2016

Written by Zijie (Jason) Ou at NYU Stern, with a lot of help from google.

Contact: jason.ou@nyu.edu

Are past periods of economic downturn associated with other factors such as the national rate of unemployment, the fed funds rate, money supply, among other factors? I decided to take a closer look, and ultimately build a rough prediction model to determine (very roughly) when the next recession may be.

Packages Imported

Pandas, numpy, and matplotlib imported to handle basic dataframe/graphical analysis. I access FRED data through pandas.io.web.

I later use scipy to do some statistical tests, and multiple packages from sklearn to create a rough Logistic Regression model.


In [292]:
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import pandas.io.data as web
import datetime
import numpy as np
from scipy import stats
from patsy import dmatrices
from sklearn.linear_model import LogisticRegression
from sklearn.cross_validation import train_test_split
from sklearn import metrics
from sklearn.cross_validation import cross_val_score

#start, end times used as parameters when pulling data from FRED.
start = datetime.datetime(1960, 1,1)
end= datetime.datetime(2016,5,11)

Importing the Data

An individual pandas dataframe is created for each imported data set.


In [293]:
gdp=web.DataReader("GDPC1",'fred',start,end)
#real GDP quarterly

gdpchange=web.DataReader("A191RL1Q225SBEA",'fred',start,end)
#real GDP change from preceding period quarterly

CPIurban=web.DataReader("CPIAUCSL",'fred',start,end)
#CPI quarterly, monthly, urban consumers

fedfunds=web.DataReader("FEDFUNDS",'fred',start,end)
#monthly fed funds rate

unemployment=web.DataReader("UNRATE",'fred',start,end)
#monthly unemployment rate

civpart=web.DataReader("CIVPART",'fred',start,end)
#monthly laborforce participation

tenyear=web.DataReader("GS10",'fred',start,end)
#monthly fed funds rate

m2v=web.DataReader("M2V",'fred',start,end)
#monthly m2 velocity

industrial=m2v=web.DataReader("INDPRO",'fred',start,end)
#monthly industrial production index

Fixing Datasets

The initial problem I encountered was that many of the datasets I collected from FRED did not share the same timescale, preventing any analysis. To fix this, I resampled each dataset to fall in line with the quarterly GDP reporting time scale, and truncated as necessary so all the dates matched.

In addition, for each dataset, I figured it would also be useful to add a new rate of change column for each dataset.


In [294]:
#Quarterly TimeGrouper
g=pd.TimeGrouper(freq='Q')

#Resampling monthly data into quarterly using the TimeGrouper
gdp3m=gdp.resample('Q',how='mean')

gdpchange3m=gdpchange.resample('Q',how='mean')
gdpchange3m=gdpchange3m[1:]
gdpchange3m.columns=['GDP_growth']

#For many of the datasets, including CPI, I created a new rate of change column.
CPIurban3m = CPIurban.groupby(g).mean()
CPIurban3m.columns=['CPI']
CPIurban3m['rate']=CPIurban3m['CPI'].pct_change()
CPIurban3m=CPIurban3m[1:]

unemployment3m=unemployment.groupby(g).mean()
unemployment3m.columns=['Unemployment Rate']
unemployment3m['Percent Change']=unemployment3m['Unemployment Rate'].pct_change()
unemployment3m=unemployment3m[1:]

fedfunds3m=fedfunds.groupby(g).mean()
fedfunds3m.columns=['FedFundsRate']
fedfunds3m['Percent Change']=fedfunds3m['FedFundsRate'].pct_change()
fedfunds3m=fedfunds3m[1:]

civpart3m=civpart.groupby(g).mean()
civpart3m.columns=['CivLabourPart']
civpart3m['Percent Change']=civpart3m['CivLabourPart'].pct_change()
civpart3m=civpart3m[1:]

tenyear3m=tenyear.groupby(g).mean()
tenyear3m.columns=['10YearBond']
tenyear3m['Percent Change']=tenyear3m['10YearBond'].pct_change()
tenyear3m=tenyear3m[1:]

m2v3m=m2v.groupby(g).mean()
m2v3m.columns=['M2Velocity']
m2v3m['Percent Change']=m2v3m['M2Velocity'].pct_change()
m2v3m=m2v3m[1:]

industrial3m=industrial.groupby(g).mean()
industrial3m.columns=['Production Index']
industrial3m['Percent Change']=industrial3m['Production Index'].pct_change()
industrial3m=industrial3m[1:]

Combining the Data Into a Single Dataset

Now that I cleaned up all of my datasets, they're ready to be combined.


In [295]:
final=gdpchange3m
final['CPI_growth']=CPIurban3m['rate']
final['Unemployment_Rate']=unemployment3m['Unemployment Rate']
final['Unemployment_Rate_growth']=unemployment3m['Percent Change']
final['Fed_Funds_Rate']=fedfunds3m['FedFundsRate']
final['Fed_Funds_Rate_growth']=fedfunds3m['Percent Change']
final['Civ_Labour_Part']=civpart3m['CivLabourPart']
final['Civ_Labour_Part_growth']=civpart3m['Percent Change']
final['Ten_Year_Bond']=tenyear3m['10YearBond']
final['Ten_Year_Bond_growth']=tenyear3m['Percent Change']
final['M2_Velocity']=m2v3m['M2Velocity']
final['M2_Velocity_growth']=m2v3m['Percent Change']
final['Production_Index_growth'] = industrial3m['Percent Change']

rowCount=len(final.index)

Final dataset looks good. No alignment issues here.


In [296]:
final


Out[296]:
GDP_growth CPI_growth Unemployment_Rate Unemployment_Rate_growth Fed_Funds_Rate Fed_Funds_Rate_growth Civ_Labour_Part Civ_Labour_Part_growth Ten_Year_Bond Ten_Year_Bond_growth M2_Velocity M2_Velocity_growth Production_Index_growth
DATE
1960-06-30 -1.5 0.006010 5.233333 0.019481 3.696667 -0.060169 59.566667 1.131862e-02 4.260000 -0.050520 24.360867 -0.021699 -0.021699
1960-09-30 1.0 0.000564 5.533333 0.057325 2.936667 -0.205591 59.566667 0.000000e+00 3.833333 -0.100156 23.960300 -0.016443 -0.016443
1960-12-31 -4.8 0.006421 6.266667 0.132530 2.296667 -0.217934 59.633333 1.119194e-03 3.886667 0.013913 23.382733 -0.024105 -0.024105
1961-03-31 2.7 0.002015 6.800000 0.085106 2.003333 -0.127721 59.633333 2.220446e-16 3.786667 -0.025729 23.028733 -0.015139 -0.015139
1961-06-30 7.6 -0.000335 7.000000 0.029412 1.733333 -0.134775 59.466667 -2.794857e-03 3.790000 0.000880 23.941700 0.039645 0.039645
1961-09-30 6.8 0.003911 6.766667 -0.033333 1.683333 -0.028846 59.200000 -4.484305e-03 3.980000 0.050132 24.705567 0.031905 0.031905
1961-12-31 8.3 0.001447 6.200000 -0.083744 2.400000 0.425743 59.000000 -3.378378e-03 3.973333 -0.001675 25.571933 0.035068 0.035068
1962-03-31 7.4 0.003890 5.633333 -0.091398 2.456667 0.023611 58.900000 -1.694915e-03 4.016667 0.010906 25.953867 0.014936 0.014936
1962-06-30 4.4 0.003764 5.533333 -0.017751 2.606667 0.061058 58.800000 -1.697793e-03 3.873333 -0.035685 26.205400 0.009692 0.009692
1962-09-30 3.9 0.002868 5.566667 0.006024 2.846667 0.092072 58.833333 5.668934e-04 3.990000 0.030120 26.484867 0.010664 0.010664
1962-12-31 1.6 0.002420 5.533333 -0.005988 2.923333 0.026932 58.533333 -5.099150e-03 3.903333 -0.021721 26.708433 0.008441 0.008441
1963-03-31 4.5 0.003182 5.766667 0.042169 2.966667 0.014823 58.600000 1.138952e-03 3.893333 -0.002562 27.202200 0.018487 0.018487
1963-06-30 5.3 0.001859 5.733333 -0.005780 2.963333 -0.001124 58.700000 1.706485e-03 3.963333 0.017979 27.919533 0.026370 0.026370
1963-09-30 8.0 0.006114 5.500000 -0.040698 3.330000 0.123735 58.633333 -1.135718e-03 4.033333 0.017662 28.105833 0.006673 0.006673
1963-12-31 2.9 0.002713 5.566667 0.012121 3.453333 0.037037 58.700000 1.137010e-03 4.120000 0.021488 28.580900 0.016903 0.016903
1964-03-31 8.9 0.004112 5.466667 -0.017964 3.463333 0.002896 58.700000 2.220446e-16 4.180000 0.014563 28.972200 0.013691 0.013691
1964-06-30 4.8 0.001617 5.200000 -0.048780 3.490000 0.007700 58.966667 4.542873e-03 4.200000 0.004785 29.652267 0.023473 0.023473
1964-09-30 5.5 0.002260 5.000000 -0.038462 3.456667 -0.009551 58.633333 -5.652911e-03 4.193333 -0.001587 30.127367 0.016022 0.016022
1964-12-31 1.4 0.004616 4.966667 -0.006667 3.576667 0.034716 58.566667 -1.137010e-03 4.173333 -0.004769 30.583867 0.015152 0.015152
1965-03-31 10.2 0.003099 4.900000 -0.013423 3.973333 0.110904 58.666667 1.707456e-03 4.203333 0.007188 31.739000 0.037769 0.037769
1965-06-30 5.6 0.006392 4.666667 -0.047619 4.076667 0.026007 58.866667 3.409091e-03 4.206667 0.000793 32.474933 0.023187 0.023187
1965-09-30 8.4 0.002964 4.366667 -0.064286 4.073333 -0.000818 58.900000 5.662514e-04 4.246667 0.009509 33.155000 0.020941 0.020941
1965-12-31 9.8 0.005277 4.100000 -0.061069 4.166667 0.022913 58.900000 0.000000e+00 4.473333 0.053375 33.825767 0.020231 0.020231
1966-03-31 10.2 0.009344 3.866667 -0.056911 4.556667 0.093600 58.866667 -5.659310e-04 4.770000 0.066319 34.794600 0.028642 0.028642
1966-06-30 1.6 0.009049 3.833333 -0.008621 4.913333 0.078274 59.033333 2.831257e-03 4.780000 0.002096 35.521233 0.020884 0.020884
1966-09-30 2.9 0.008659 3.766667 -0.017391 5.410000 0.101085 59.233333 3.387916e-03 5.140000 0.075314 36.070833 0.015472 0.015472
1966-12-31 3.5 0.008176 3.700000 -0.017699 5.563333 0.028343 59.466667 3.939223e-03 5.003333 -0.026589 36.415533 0.009556 0.009556
1967-03-31 3.7 0.002534 3.833333 0.036036 4.823333 -0.133014 59.300000 -2.802691e-03 4.583333 -0.083944 36.214900 -0.005510 -0.005510
1967-06-30 0.3 0.006067 3.833333 0.000000 3.990000 -0.172771 59.433333 2.248454e-03 4.820000 0.051636 36.067300 -0.004076 -0.004076
1967-09-30 3.5 0.010050 3.800000 -0.008696 3.893333 -0.024227 59.666667 3.925967e-03 5.246667 0.088520 36.315367 0.006878 0.006878
... ... ... ... ... ... ... ... ... ... ... ... ... ...
2008-12-31 -8.2 -0.022902 6.866667 0.144444 0.506667 -0.738832 65.900000 -2.522704e-03 3.253333 -0.157895 96.012400 -0.042607 -0.042607
2009-03-31 -5.4 -0.006879 8.266667 0.203883 0.183333 -0.638158 65.700000 -3.034901e-03 2.736667 -0.158811 90.647000 -0.055882 -0.055882
2009-06-30 -0.5 0.005318 9.300000 0.125000 0.180000 -0.018182 65.700000 0.000000e+00 3.313333 0.210719 87.964700 -0.029591 -0.029591
2009-09-30 1.3 0.008604 9.633333 0.035842 0.156667 -0.129630 65.333333 -5.580923e-03 3.516667 0.061368 89.203767 0.014086 0.014086
2009-12-31 3.9 0.007829 9.933333 0.031142 0.120000 -0.234043 64.866667 -7.142857e-03 3.460000 -0.016114 90.591767 0.015560 0.015560
2010-03-31 1.7 0.001585 9.833333 -0.010067 0.133333 0.111111 64.866667 0.000000e+00 3.716667 0.074181 92.311100 0.018979 0.018979
2010-06-30 3.9 -0.000353 9.633333 -0.020339 0.193333 0.450000 64.900000 5.138746e-04 3.490000 -0.060987 94.226800 0.020753 0.020753
2010-09-30 2.7 0.002931 9.466667 -0.017301 0.186667 -0.034483 64.633333 -4.108885e-03 2.786667 -0.201528 95.580833 0.014370 0.014370
2010-12-31 2.5 0.008097 9.500000 0.003521 0.186667 0.000000 64.433333 -3.094379e-03 2.863333 0.027512 95.944100 0.003801 0.003801
2011-03-31 -1.5 0.010672 9.033333 -0.049123 0.156667 -0.160714 64.166667 -4.138645e-03 3.460000 0.208382 96.426700 0.005030 0.005030
2011-06-30 2.9 0.011370 9.066667 0.003690 0.093333 -0.404255 64.100000 -1.038961e-03 3.210000 -0.072254 96.637900 0.002190 0.002190
2011-09-30 0.8 0.006521 9.000000 -0.007353 0.083333 -0.107143 64.100000 0.000000e+00 2.426667 -0.244029 97.619967 0.010162 0.010162
2011-12-31 4.6 0.004489 8.633333 -0.040741 0.073333 -0.120000 64.066667 -5.200208e-04 2.046667 -0.156593 98.454833 0.008552 0.008552
2012-03-31 2.7 0.005856 8.266667 -0.042471 0.103333 0.409091 63.766667 -4.682622e-03 2.036667 -0.004886 99.331467 0.008904 0.008904
2012-06-30 1.9 0.002019 8.200000 -0.008065 0.153333 0.483871 63.733333 -5.227392e-04 1.823333 -0.104746 99.958533 0.006313 0.006313
2012-09-30 0.5 0.004067 8.033333 -0.020325 0.143333 -0.065217 63.633333 -1.569038e-03 1.643333 -0.098720 100.037333 0.000788 0.000788
2012-12-31 0.1 0.007107 7.800000 -0.029046 0.160000 0.116279 63.700000 1.047669e-03 1.706667 0.038540 100.672633 0.006351 0.006351
2013-03-31 1.9 0.003967 7.733333 -0.008547 0.143333 -0.104167 63.433333 -4.186290e-03 1.950000 0.142578 101.356567 0.006794 0.006794
2013-06-30 1.1 -0.001221 7.533333 -0.025862 0.116667 -0.186047 63.400000 -5.254861e-04 1.996667 0.023932 101.668733 0.003080 0.003080
2013-09-30 3.0 0.005074 7.300000 -0.030973 0.083333 -0.285714 63.266667 -2.103049e-03 2.710000 0.357262 101.907267 0.002346 0.002346
2013-12-31 3.8 0.004597 6.933333 -0.050228 0.086667 0.040000 62.900000 -5.795574e-03 2.746667 0.013530 102.717733 0.007953 0.007953
2014-03-31 -0.9 0.005830 6.666667 -0.038462 0.073333 -0.153846 63.033333 2.119767e-03 2.763333 0.006068 103.281200 0.005486 0.005486
2014-06-30 4.6 0.004753 6.166667 -0.075000 0.093333 0.272727 62.800000 -3.701745e-03 2.623333 -0.050663 104.672433 0.013470 0.013470
2014-09-30 4.3 0.002278 6.133333 -0.005405 0.090000 -0.035714 62.866667 1.061571e-03 2.496667 -0.048285 105.327067 0.006254 0.006254
2014-12-31 2.1 -0.000781 5.700000 -0.070652 0.100000 0.111111 62.833333 -5.302227e-04 2.280000 -0.086782 106.283033 0.009076 0.009076
2015-03-31 0.6 -0.007237 5.566667 -0.023392 0.110000 0.100000 62.800000 -5.305040e-04 1.966667 -0.137427 105.787733 -0.004660 -0.004660
2015-06-30 3.9 0.006043 5.400000 -0.029940 0.123333 0.121212 62.700000 -1.592357e-03 2.166667 0.101695 105.053000 -0.006945 -0.006945
2015-09-30 2.0 0.003420 5.166667 -0.043210 0.136667 0.108108 62.533333 -2.658161e-03 2.220000 0.024615 105.453667 0.003814 0.003814
2015-12-31 1.4 0.001916 5.000000 -0.032258 0.160000 0.170732 62.533333 0.000000e+00 2.190000 -0.013514 104.583167 -0.008255 -0.008255
2016-03-31 0.5 -0.000781 4.933333 -0.013333 0.360000 1.250000 62.866667 5.330490e-03 1.920000 -0.123288 103.992533 -0.005647 -0.005647

224 rows × 13 columns

For the sake of analysis, I realized I would also need to create a similar dataset where each row corresponds to a recession year. This way, I could compare each indicator's performance across all years with performance only in recession years.


In [297]:
finalOnlyNegatives = pd.DataFrame(np.nan, index=[0], columns=final.columns)
for index, row in final.iterrows():
    if(row['GDP_growth'] < 0):
        finalOnlyNegatives.loc[finalOnlyNegatives.index.max()+1]=row[0],row[1],row[2],row[3],row[4],row[5],row[6],row[7],row[8],row[9],row[10],row[11],row[12]
finalOnlyNegatives=finalOnlyNegatives[1:]

finalRowCount=len(finalOnlyNegatives.index)

finalOnlyNegatives


Out[297]:
GDP_growth CPI_growth Unemployment_Rate Unemployment_Rate_growth Fed_Funds_Rate Fed_Funds_Rate_growth Civ_Labour_Part Civ_Labour_Part_growth Ten_Year_Bond Ten_Year_Bond_growth M2_Velocity M2_Velocity_growth Production_Index_growth
1 -1.5 0.006010 5.233333 0.019481 3.696667 -0.060169 59.566667 1.131862e-02 4.260000 -0.050520 24.360867 -0.021699 -0.021699
2 -4.8 0.006421 6.266667 0.132530 2.296667 -0.217934 59.633333 1.119194e-03 3.886667 0.013913 23.382733 -0.024105 -0.024105
3 -1.7 0.015343 3.566667 0.000000 8.940000 -0.004824 60.266667 5.534034e-04 7.296667 0.064171 40.377767 -0.006269 -0.006269
4 -0.7 0.016000 4.166667 0.168224 8.573333 -0.041014 60.466667 3.318584e-03 7.366667 0.009593 39.399533 -0.024227 -0.024227
5 -4.0 0.014518 5.833333 0.129032 5.566667 -0.169567 60.400000 1.658375e-03 6.853333 -0.081323 38.198500 -0.021618 -0.021618
6 -2.2 0.019727 4.800000 -0.027027 10.560000 0.350959 60.800000 5.485464e-04 7.206667 0.058766 46.958500 0.008727 0.008727
7 -3.3 0.029753 5.133333 0.076923 9.323333 -0.067356 61.333333 3.818876e-03 7.053333 0.044423 47.236600 -0.008785 -0.008785
8 -3.8 0.028140 5.633333 0.083333 12.090000 0.074667 61.333333 2.724796e-03 7.963333 0.055678 47.076800 -0.004123 -0.004123
9 -1.6 0.030708 6.600000 0.171598 9.346667 -0.226909 61.266667 -1.086957e-03 7.670000 -0.036835 45.201900 -0.039826 -0.039826
10 -4.7 0.021373 8.266667 0.252525 6.303333 -0.325606 61.200000 -1.088139e-03 7.540000 -0.016949 42.208733 -0.066218 -0.066218
11 -7.9 0.033741 7.333333 0.164021 12.686667 -0.156845 63.800000 -1.564945e-03 10.476667 -0.125973 51.645267 -0.042098 -0.042098
12 -0.6 0.018768 7.666667 0.045455 9.836667 -0.224645 63.700000 -1.567398e-03 10.953333 0.045498 50.804300 -0.016284 -0.016284
13 -2.9 0.020849 7.400000 -0.004484 17.780000 0.073024 64.066667 1.563314e-03 13.750000 0.060957 53.055867 0.003102 0.003102
14 -4.6 0.016257 8.233333 0.112613 13.586667 -0.227005 63.766667 1.046572e-03 14.086667 -0.051190 52.350967 -0.022294 -0.022294
15 -6.5 0.008887 8.833333 0.072874 14.226667 0.047105 63.766667 2.220446e-16 14.293333 0.014671 51.277833 -0.020499 -0.020499
16 -1.4 0.017367 9.900000 0.049470 11.006667 -0.241617 64.066667 1.041667e-03 13.116667 -0.058387 49.937733 -0.013907 -0.013907
17 -3.4 0.016979 6.133333 0.076023 7.743333 -0.051062 66.400000 -1.003009e-03 8.396667 -0.035236 64.297967 -0.015611 -0.015611
18 -1.9 0.007476 6.600000 0.076087 6.426667 -0.170039 66.233333 -2.510040e-03 8.016667 -0.045256 63.078900 -0.018960 -0.018960
19 -1.1 0.009566 4.233333 0.085470 5.593333 -0.135942 67.166667 3.986049e-03 5.050000 -0.092814 94.891133 -0.013423 -0.013423
20 -1.3 0.002823 4.833333 0.098485 3.496667 -0.191834 66.700000 -9.985022e-04 4.980000 -0.055028 92.362833 -0.014306 -0.014306
21 -2.7 0.010832 5.000000 0.041667 3.176667 -0.293551 66.100000 2.527806e-03 3.663333 -0.140063 105.086533 -0.004278 -0.004278
22 -1.9 0.015419 6.000000 0.125000 1.940000 -0.070288 66.066667 5.047956e-04 3.863333 -0.006003 100.285300 -0.032109 -0.032109
23 -8.2 -0.022902 6.866667 0.144444 0.506667 -0.738832 65.900000 -2.522704e-03 3.253333 -0.157895 96.012400 -0.042607 -0.042607
24 -5.4 -0.006879 8.266667 0.203883 0.183333 -0.638158 65.700000 -3.034901e-03 2.736667 -0.158811 90.647000 -0.055882 -0.055882
25 -0.5 0.005318 9.300000 0.125000 0.180000 -0.018182 65.700000 0.000000e+00 3.313333 0.210719 87.964700 -0.029591 -0.029591
26 -1.5 0.010672 9.033333 -0.049123 0.156667 -0.160714 64.166667 -4.138645e-03 3.460000 0.208382 96.426700 0.005030 0.005030
27 -0.9 0.005830 6.666667 -0.038462 0.073333 -0.153846 63.033333 2.119767e-03 2.763333 0.006068 103.281200 0.005486 0.005486

Analysis

Look good. Out of 224 recorded quarters, there are only 27 quarters of negative GDP growth.

Now lets dive deeper and see how the indicators differ in times of recession and across all quarters. The methodology is as follows:

For each indicator, determine the % of the time where growth is positive from the initial dataframe containing all records. This is where the rate of change column I created for each indicator comes in handy.

I can then compare this percentage with the percentage of the time where the indicator's growth is positive from the negative-GDP-growth-only data frame.

For the sake of statistical soundness, I also perform a 2-tailed Proportion p-test to determine which differences are statistically significant.


In [298]:
#cpi % positive growth across all quarters, compared with 
CPIcount=0
for index, row in final.iterrows():
    if(row['CPI_growth'] > 0):
        CPIcount=CPIcount+1
CPIpercent = (CPIcount/rowCount)

#CPI % positive growth across negative GDP growth quarters.
newCPICount = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['CPI_growth'] > 0):
        newCPICount=newCPICount+1
newCPIpercent = (newCPICount/finalRowCount)

#CPI 2-tailed Proportion p-test of significance
CPIz=np.array([[rowCount,CPIcount],[finalRowCount,newCPICount]])
CPIp=(stats.chi2_contingency(CPIz)[1])

#The same is done for each of the indicators

#unemployment rate growth average and negative-only 
unemploymentCount=0
for index, row in final.iterrows():
    if(row['Unemployment_Rate_growth'] > 0):
        unemploymentCount=unemploymentCount+1
unemploymentPercent = (unemploymentCount/rowCount)

newUnemploymentCount = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['Unemployment_Rate_growth'] > 0):
        newUnemploymentCount=newUnemploymentCount+1
newUnemploymentPercent = (newUnemploymentCount/finalRowCount)
    
unemploymentz=np.array([[rowCount,unemploymentCount],[finalRowCount,newUnemploymentCount]])
unemploymentp=(stats.chi2_contingency(unemploymentz)[1])

#fed funds rate average and negative-only 
fedfundscount=0
for index, row in final.iterrows():
    if(row['Fed_Funds_Rate_growth'] > 0):
        fedfundscount=fedfundscount+1
fedfundspercent = (fedfundscount/rowCount)

newfedfundscount = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['Fed_Funds_Rate_growth'] > 0):
        newfedfundscount=newfedfundscount+1
newFedFundsPercent = (newfedfundscount/finalRowCount)
    
fedfundsz=np.array([[rowCount,fedfundscount],[finalRowCount,newfedfundscount]])
fedfundsp=(stats.chi2_contingency(fedfundsz)[1])

#civ labour participation rate average and negative-only 
civlaborcount=0
for index, row in final.iterrows():
    if(row['Civ_Labour_Part_growth'] > 0):
        civlaborcount=civlaborcount+1
civlabourpercent = (civlaborcount/rowCount)

newcivlaborcount = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['Civ_Labour_Part_growth'] > 0):
        newcivlaborcount=newcivlaborcount+1
newcivlaborpercent = (newcivlaborcount/finalRowCount)
    
civlabourz=np.array([[rowCount,civlaborcount],[finalRowCount,newcivlaborcount]])
civlabourp=(stats.chi2_contingency(civlabourz)[1])

#ten year bond rate average and negative-only 
tenyearcount=0
for index, row in final.iterrows():
    if(row['Ten_Year_Bond_growth'] > 0):
        tenyearcount=tenyearcount+1
tenyearpercent = (tenyearcount/rowCount)

newtenyearcount = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['Ten_Year_Bond_growth'] > 0):
        newtenyearcount=newtenyearcount+1
newtenyearpercent = (newtenyearcount/finalRowCount)
    
tenyearz=np.array([[rowCount,tenyearcount],[finalRowCount,newtenyearcount]])
tenyearp=(stats.chi2_contingency(tenyearz)[1])

#M2 velocity average and negative-only 
m2count=0
for index, row in final.iterrows():
    if(row['M2_Velocity_growth'] > 0):
        m2count=m2count+1
m2percent = (m2count/rowCount)

newm2count = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['M2_Velocity_growth'] > 0):
        newm2count=newm2count+1
newm2percent = (newm2count/finalRowCount)
    
m2z=np.array([[rowCount,m2count],[finalRowCount,newm2count]])
m2p=(stats.chi2_contingency(m2z)[1])

#Production Index Growth average and negative-only 
productioncount=0
for index, row in final.iterrows():
    if(row['Production_Index_growth'] > 0):
        productioncount=productioncount+1
productionpercent = (productioncount/rowCount)

newproductioncount = 0
for index, row in finalOnlyNegatives.iterrows():
    if(row['Production_Index_growth'] > 0):
        newproductioncount=newproductioncount+1
newproductionpercent = (newproductioncount/finalRowCount)
    
productionz=np.array([[rowCount,productioncount],[finalRowCount,newproductioncount]])
productionp=(stats.chi2_contingency(productionz)[1])

Now let's create a new dataframe to store the results.


In [299]:
beforeAfterCompare = pd.DataFrame(np.nan, index=[0], columns=final.columns)
beforeAfterCompare=beforeAfterCompare.drop('GDP_growth',axis=1)
beforeAfterCompare=beforeAfterCompare.drop('Fed_Funds_Rate',axis=1)
beforeAfterCompare=beforeAfterCompare.drop('Unemployment_Rate',axis=1)
beforeAfterCompare=beforeAfterCompare.drop('Civ_Labour_Part',axis=1)
beforeAfterCompare=beforeAfterCompare.drop('Ten_Year_Bond',axis=1)
beforeAfterCompare=beforeAfterCompare.drop('M2_Velocity',axis=1)
beforeAfterCompare['Type']=""

beforeAfterCompare.loc[0]=CPIpercent,unemploymentPercent,fedfundspercent,civlabourpercent,tenyearpercent,m2percent,productionpercent,'All Quarters'
beforeAfterCompare.loc[1]=newCPIpercent,newUnemploymentPercent,newFedFundsPercent,newcivlaborpercent,newtenyearpercent,newm2percent,newproductionpercent,'Recession Quarters'

beforeAfterCompare=beforeAfterCompare.set_index('Type')
beforeAfterCompare


Out[299]:
CPI_growth Unemployment_Rate_growth Fed_Funds_Rate_growth Civ_Labour_Part_growth Ten_Year_Bond_growth M2_Velocity_growth Production_Index_growth
Type
All Quarters 0.946429 0.348214 0.508929 0.508929 0.526786 0.767857 0.767857
Recession Quarters 0.925926 0.814815 0.148148 0.592593 0.444444 0.148148 0.148148

Looks good! With this dataframe we can clearly identify some indicators whose positive growth rates differs significantly when GDP growth is negative and across all quarters.

Namely, the unemployment rate increases only 34% of the time across all periods, but increases 81% of the time in times of recession. Other factors, such as the fed funds rate and the M2 Velocity also differ substantially.

Other factors differ moderately, or barely at all such as with CPI growth.

Just to confirm that these changes are significant, lets see what the p-values are.


In [300]:
significance = pd.DataFrame(np.nan, index=[0], columns=beforeAfterCompare.columns)

significance.loc[0]=CPIp,unemploymentp,fedfundsp,civlabourp,tenyearp,m2p,productionp,
significance


Out[300]:
CPI_growth Unemployment_Rate_growth Fed_Funds_Rate_growth Civ_Labour_Part_growth Ten_Year_Bond_growth M2_Velocity_growth Production_Index_growth
0 0.942456 0.010091 0.029397 0.777339 0.773569 0.001711 0.001711

Looks good. This confirms that those factors that differ substantially have low p-values, which signifies statistical significance. (Under .05 is a good cut off here.)

Visualization

Lets visualize this analysis.


In [301]:
fig, ax = plt.subplots(figsize=(18,10))

beforeAfterCompare.plot(ax=ax,kind='bar',title='Differences in Key Indicators During Recession')


Out[301]:
<matplotlib.axes._subplots.AxesSubplot at 0x1186362b0>

In addition to comparing the percentage of the time where the indicator experiences positive growth, we can also compare the average value of the indicator across all time periods vs periods of negative GDP growth.

This analysis is split into two graphs in order to better see the differences.


In [302]:
final=final.reset_index()

final['Recession']=(final.GDP_growth < 0).astype(float)
finalvalues=final.groupby('Recession').mean()

finalvalues=finalvalues.drop("GDP_growth",axis=1)
finalvalues=finalvalues.drop("CPI_growth",axis=1)
finalvalues=finalvalues.drop("Unemployment_Rate_growth",axis=1)
finalvalues=finalvalues.drop("Fed_Funds_Rate_growth",axis=1)
finalvalues=finalvalues.drop("Civ_Labour_Part_growth",axis=1)
finalvalues=finalvalues.drop("Ten_Year_Bond_growth",axis=1)
finalvalues=finalvalues.drop("M2_Velocity_growth",axis=1)
finalvalues=finalvalues.drop("Production_Index_growth",axis=1)
finalvalues2=finalvalues
finalvalues=finalvalues.drop("Civ_Labour_Part",axis=1)
finalvalues=finalvalues.drop("M2_Velocity",axis=1)

finalvalues


Out[302]:
Unemployment_Rate Fed_Funds_Rate Ten_Year_Bond
Recession
0 6.023858 5.015939 6.188088
1 6.585185 6.862840 7.158148

In [303]:
fig, ax = plt.subplots(figsize=(18,10))

finalvalues.plot(ax=ax,kind='bar',title='Differences in Key Indicators During Recession')


Out[303]:
<matplotlib.axes._subplots.AxesSubplot at 0x115156630>

In [304]:
finalvalues2=finalvalues2.drop("Unemployment_Rate",axis=1)
finalvalues2=finalvalues2.drop("Fed_Funds_Rate",axis=1)
finalvalues2=finalvalues2.drop("Ten_Year_Bond",axis=1)
finalvalues2


Out[304]:
Civ_Labour_Part M2_Velocity
Recession
0 63.718613 66.263488
1 63.429630 62.881799

In [305]:
fig, ax = plt.subplots(figsize=(18,10))

finalvalues2.plot(ax=ax,kind='bar',title='Differences in Key Indicators During Recession',ylim=(50,70))


Out[305]:
<matplotlib.axes._subplots.AxesSubplot at 0x118b5b588>

Finally, a Prediction Model

I created a train/test logistic regression model using SKlearn, which ultimately resulted in a moderately successful model. Since negative gdp growth occured in only 12% of periods, the null success rate here is 88%. If one guessed that every period would have no recession, they would be right 88% of the time.

This model is 92.6% accurate, which is little more than a third better than just guessing. This goes to show that it is difficult to create a prediction model for recessions from these indicators alone. Still, its better than nothing.


In [306]:
y, X = dmatrices('Recession ~ CPI_growth + Unemployment_Rate + Unemployment_Rate_growth + Fed_Funds_Rate + Civ_Labour_Part + Ten_Year_Bond+ M2_Velocity', final,return_type="dataframe")

In [307]:
y = np.ravel(y)

In [308]:
#this is the % of the time negative gdp growth occured.
y.mean()


Out[308]:
0.12053571428571429

In [309]:
#creating test, train data sets, with test size of 30%.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.3, random_state=2)
model = LogisticRegression()
model.fit(X_train, y_train)


Out[309]:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)

In [310]:
# predict class labels for the test set
predicted = model.predict(X_test)
print (predicted)
#making sure there is an occurance of negative GDP growth in the test set.


[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

In [315]:
# This is the accuracy, 92.6%
print (metrics.accuracy_score(y_test, predicted))


0.926470588235

Conclusion

Predicting GDP decline is hard stuff! Its obvious that there is a correlation with certain indicators, such as unemployment, the fed funds rate, and M2 velocity, but this is still just correlation.

Still, combining the moderately successful model with an awareness of the current status of these indicators would be a pretty good bet. If the model is predicting a period of negative GDP growth, and the correlated indicators are also in line with recession values, its likely a recession is right around the corner.

Data Sources

FRED